Abstract. Onion routing is a scheme for anonymous communication
that is designed for practical use. Until now, however, it has had no for- 
mal model and therefore no rigorous analysis of its anonymity guarantees. 
We give an IO-automata model of an onion-routing protocol and, under 
possibilistic definitions, characterize the situations in which anonymity 
and unlinkability are guaranteed. 
1 Introduction

Anonymity networks allow users to communicate while hiding their identities 
from one another and from third parties. We would like to design such networks 
with strong anonymity guarantees but without incurring high communication 
overhead or much added latency. Many designs have been proposed that meet 
these goals to varying degrees [1]. 

Of the many design proposals, onion routing [8] has had notable success in 
practice. Several implementations have been made [8,13,6], and there was a 
similar commercial system, Freedom [2]. As of September 2006, the most recent
iteration of the basic design, Tor [6], consists of over 750 routers, each
processing an average of 100KB/s. Onion routing is a practical
anonymity-network scheme with relatively low overhead and latency. It provides two-way,
connection-based communication and does not require that the destination par- 
ticipate in the anonymity-network protocol. These features make it useful for 
anonymizing much of the communication that takes place over the Internet to- 
day, such as web browsing, chatting, and remote login. 

Many Tor users communicate with web-based businesses and financial ser- 
vices. Chaum [4] was the first to note that even the best ecash design fails to be 
anonymous if the network identifies the customer. Even if a client is not hidden 
from the service, e.g., she’s using ordinary credit cards, she may desire privacy 
from her network-service provider, which might be her employer or just an ISP 
that is not careful with logs of its users’ activities. Examples of the threat posed 
by both of these situations have been all too frequent in the news. Businesses 
also make integral use of Tor to protect their commercial interests from
competitors or to investigate the public offerings of their competitors without being
observed. One vendor discovered by using Tor that its competitor had been 
offering a customized web site just for connections from the vendor’s IP address. 

Low latency and other performance characteristics of Tor can be demon- 
strated experimentally; anonymity-preserving properties cannot. Also, even with 
careful design, vulnerabilities can persist. The initial Tor authentication proto- 
col had a cryptographic-assumption flaw that left it open to man-in-the-middle 
attacks. The revised authentication protocol was then proven to be free of such 
flaws [7]. As Tor is increasingly relied upon for sensitive personal and business 
transactions, it is increasingly important to assure its users that their anonymity 
will be preserved. Long-established components of such assurance in system se- 
curity include a formal model, proving security guarantees in that model, and 
arguing that the model captures essential features of the deployed system. These 
are what we provide in this paper. 

An onion-routing network consists of a set of onion routers and clients. To 
send data, a client chooses a sequence of routers, called a circuit , and constructs 
the circuit using the routers’ public keys. During construction, a shared symmet- 
ric key is agreed upon with each router. Before sending data, these keys are used 
to encrypt each packet once for each router in the circuit and in the reverse of 
the order that the routers appear in the circuit. Each router uses its shared key 
to decrypt the data as it is sent down the circuit so it is fully decrypted at the 
end. Data flowing up to the client has a layer of encryption added by each onion 
router, all of which are removed by the client. The layered encryption helps hide 
the data contents and destination from all but the last router and the source 
from all but the first router. The multiple encryption and decryption also makes 
it harder for an observer to follow the path the data takes through the network. 
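
The layering described above can be sketched symbolically. This is only an illustration under stated assumptions: `Enc`, `wrap`, and `peel` are hypothetical names, keys are plain strings, and no real cryptography is performed.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Enc:
    """Symbolic ciphertext: `body` sealed under `key` (no real cryptography)."""
    body: Union["Enc", str]
    key: str


def wrap(payload, circuit_keys):
    """Encrypt once per router, in the reverse of circuit order,
    so the first router's layer ends up outermost."""
    msg = payload
    for key in reversed(circuit_keys):
        msg = Enc(msg, key)
    return msg


def peel(msg, key):
    """Remove the outermost layer; only the matching key works."""
    if not isinstance(msg, Enc) or msg.key != key:
        raise ValueError("key does not match the outermost layer")
    return msg.body


keys = ["k1", "k2", "k3"]   # keys shared with routers 1, 2, 3 (assumed labels)
onion = wrap("hello", keys)
for k in keys:              # each router peels its own layer in turn
    onion = peel(onion, k)
```

After the loop, `onion` is the original payload again, which mirrors the statement that data is fully decrypted at the end of the circuit.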

The anonymity of onion routing has not yet been rigorously proven. We thus
propose a formal model of onion routing based on the Tor protocol and analyze
the anonymity it provides. Our model is expressed using IO automata [9],
which provide us with asynchronous computation and communication. We then
suggest definitions of anonymity and unlinkability with respect to an adversary 
within this model. The adversary is local and active in that he controls only 
a subset of the routers but can perform any arbitrary computation with them. 
This is the adversary against which Tor was designed to protect [6] and is a good 
adversary model for the situation in which Tor is currently running. Finally, we 
provide necessary and sufficient conditions for anonymity and unlinkability to 
be provided to a user in our model. It should be noted that we do not analyze 
the validity of the cryptography used in the protocol and instead base our proofs 
on some reasonable assumptions about the cryptosystem. 

We only consider possibilistic anonymity here. An action by user u is consid- 
ered to be anonymous when there exists some system in which u doesn’t perform 
the action, and that system has an execution that is consistent with what the 
adversary sees. The actions for which we consider providing anonymity are sending
messages, receiving messages, and communicating with a specific destination.
More refined definitions of anonymity [12,5] incorporate probability. We leave
for future work applying such definitions to our system, which could be done by 
defining a probability measure over executions or initial states. Also, for simplic- 
ity, the model includes almost no concept of time. We do add circuit identifiers 
to mimic an attacker’s ability to do timing attacks: the observation of distinctive 
timing patterns in a stream of data, whether inherent or attacker-induced. There 
is no timestamp included with actions, though, so there is only an ordering on 
actions. 

The main result we show is that the adversary can determine a router in a 
given user’s circuit if and only if it controls an adjacent router, with some other 
minor conditions. In particular, the adversary can determine which user owns 
a circuit only when the adversary controls the first hop. The users that
have an uncompromised first hop form a sender “anonymity set,” among which
the adversary cannot distinguish. Similarly, the adversary can determine the last 
router of a circuit only when it controls it or the penultimate router. Such circuits 
provide receiver anonymity. Also, a user is “unlinkable” to his destination when 
he has receiver anonymity or his sender anonymity set includes another sender 
with a destination that is different or unknown to the adversary. 
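
As a rough illustration of this result, the sketch below computes the sender anonymity set and checks whether a circuit's last router is exposed. It assumes circuits are given as tuples of router names and ignores the "other minor conditions" mentioned above; the function names and the example data are hypothetical.

```python
def sender_anonymity_set(circuits, compromised):
    """Users who are uncompromised and whose first hop is uncompromised."""
    return {
        user
        for user, path in circuits.items()
        if user not in compromised and path[0] not in compromised
    }


def last_router_exposed(path, compromised):
    """The adversary learns the last router only if it controls that
    router or the penultimate one."""
    return path[-1] in compromised or path[-2] in compromised


# Hypothetical example: three users, adversary controls router r2.
circuits = {
    "alice": ("r1", "r2", "r3"),
    "bob": ("r4", "r5", "r1"),
    "carol": ("r2", "r1", "r3"),
}
compromised = {"r2"}

anon = sender_anonymity_set(circuits, compromised)
exposed = {u for u, p in circuits.items() if last_router_exposed(p, compromised)}
```

Here `anon` contains alice and bob (carol's first hop is compromised), and only alice's last router is exposed, because r2 is her penultimate router.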

The first-hop/last-hop attack is well-known [13], but we state it in full detail 
and show that, in a reasonable formal model, it is the only attack that an adver- 
sary can mount. Also, our results persist with or without some of the nontrivial 
design choices, such as multiple encryption and stream ciphers. This doesn’t im- 
ply that these features are unnecessary - they may make attacks more difficult 
in practice and meet other goals such as data integrity, but it does illuminate 
their effect on the security of the protocol. Finally, we present the first formal 
network model and protocol definition for onion routing, and give definitions for 
anonymity and unlinkability within that model. 


2 Related Work 

Numerous papers have informally discussed the security of the design of onion 
routing and related systems, as well as theoretical and experimentally demon- 
strated attacks. There have also been numerous formalizations of anonymous 
communication. However, formal analyses have primarily been of systems other 
than onion routing, e.g., DC nets and Crowds. (Cf. [1] for examples of all of 
these.) 

Recent papers have formalized systems similar to onion routing but without 
persistent circuits. Camenisch and Lysyanskaya [3] prove that the cryptography 
their protocol uses doesn’t leak any information to nodes in the path other than 
the previous and next nodes, but leave open what anonymity it provides. This 
question is answered in part by Mauw et al. [10], who formalize a similar con- 
nectionless protocol in an ACP-style process algebra and, under a possibilistic 
definition of anonymity, show that it provides sender and receiver anonymity 
against a global passive adversary. Cryptography is dealt with using high level 
assumptions, similar to our work. Their model and analysis have much in common
with this paper, but they differ in several important ways. First, the protocol
they investigate is connectionless: each data “onion” stores the identities of the 
routers it will pass through. This is significantly different from onion routing, 
which is circuit-based. Second, the analysis is done with respect to a passive 
adversary, which exhibits only a subset of the behavior of an active adversary. 
Third, in their model agents choose destinations asynchronously and the ad- 
versary must take into account every onion he has seen when trying to break 
anonymity. In our model all agents choose a single destination, which gives the 
adversary more power. In some ways, our work extends theirs, and several of the 
differences noted here appear in [10] as suggestions for further research. 

3 Model 

3.1 Distributed system 

Our model of onion routing is based on IO automata [9]. This formalism allows
us to express an onion-routing protocol, model the network, and make precise 
the adversary’s capabilities. One of its benefits is that it models asynchronous 
computation and communication. Another is that every action is performed by 
a single agent, so the perspective of the adversary is fairly clear. 

We model onion routing as a fully connected asynchronous network of IO
automata. The network is composed of FIFO channels. There is a set of users U
and a set of routers R. Let $N = U \cup R$. The term agent refers to any element of
N. It is possible that $U \cap R \neq \emptyset$. In this case, user and router automata exist on
the same processor. We assume that the users all create circuits of a fixed length 
l. (In the current Tor network, l = 3.) Each router-and-user pair shares a set of 
secret keys; however, the router does not know which of its keys belong to which 
user. This separates, for now, key distribution from the rest of the protocol. We 
assume that all keys in the system are distinct. Let K be the keyspace. The 
triple $(u, r, i)$ will refer to the $i$th key shared by user u and router r.

Let $P$ be the set of control messages, and $\overline{P}$ be the extension of $P$ by
encryption with up to l keys. The control messages will be tagged with a link
identifier and circuit identifier when sent, so let the protocol message space be
$M = \mathbb{N}^+ \times \mathbb{N}^+ \times \overline{P}$. We denote the encryption of $p \in \overline{P}$ using key $k$ with $\{p\}_k$,
and the decryption with $\{p\}_{-k}$. For brevity, the multiply encrypted message
$\{\{p\}_{k_1}\}_{k_2}$ will be denoted $\{p\}_{k_1,k_2}$. Brackets will be used to indicate the list
structure of a message (i.e., $[p_1, p_2, \ldots]$).

The adversary in our system is a set of users and routers $A \subseteq N$. The
adversary is active in the sense that the automata running on members of A are
completely arbitrary. We call an agent $a$ compromised if $a \in A$.


3.2 Automata 

We give the automata descriptions for the users and routers that are based on the 
Tor protocol [6]. We have simplified the protocol in several ways. In particular,
we do not perform key exchange, do not use a stream cipher, have each user
construct exactly one circuit to one destination, do not include circuit teardowns, 
eliminate the final unencrypted message forward, and omit stream management 
and congestion control. We are also using circuit identifiers to mimic the effect 
of a timing attack. Section 4.7 discusses the effects of changing some of these 
features of our protocol. 

During the protocol each user u iteratively constructs a circuit to his
destination. u begins by sending the message $\{\mathrm{CREATE}\}_{k_1}$ to the first router,
$r_1$, on his circuit. The message is encrypted with a key, $k_1$, shared between
u and $r_1$. $r_1$ identifies $k_1$ by repeatedly trying to decrypt the message with
each one of its keys until the result is a valid control message. It responds with
the message CREATED. (Note that this is different from the implemented Tor
protocol, in which the CREATE message would be encrypted with the public
key for $r_1$ rather than one of the shared keys it holds.) Given a
partially-constructed circuit, u adds another router, $r_i$, to the end by sending the
message $\{[\mathrm{EXTEND}, r_i, \{\mathrm{CREATE}\}_{k_i}]\}_{k_{i-1},\ldots,k_1}$ down the circuit. As the message
gets forwarded down the circuit, each router decrypts it. $r_{i-1}$ performs the
CREATE steps described above, and then returns the message $\{\mathrm{EXTENDED}\}_{k_{i-1}}$.
Each router encrypts this message as it is sent back up the circuit.
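
The sequence of payloads a user sends while building a circuit can be sketched as follows. This is a symbolic illustration only: ciphertexts are nested tuples, and `enc` and `build_messages` are hypothetical helper names rather than part of the model.

```python
def enc(body, keys):
    """Symbolic {body}_{keys[0],...,keys[-1]}: keys[0] is applied first
    (innermost), keys[-1] last (outermost)."""
    for k in keys:
        body = ("enc", k, body)
    return body


def build_messages(routers, keys):
    """Payloads sent by the user, in order, to build the whole circuit.

    routers[b] and keys[b] are 0-based stand-ins for r_{b+1} and k_{b+1}.
    """
    msgs = [enc("CREATE", [keys[0]])]
    for b in range(1, len(routers)):
        inner = ("EXTEND", routers[b], enc("CREATE", [keys[b]]))
        # Layers k_b (innermost) down to k_1 (outermost), so each router
        # on the existing circuit strips exactly one layer.
        msgs.append(enc(inner, list(reversed(keys[:b]))))
    return msgs


msgs = build_messages(["r1", "r2", "r3"], ["k1", "k2", "k3"])
```

For a three-router circuit, the last payload is the EXTEND message for `r3` wrapped first under `k2` and then under `k1`, matching the subscript order in the text.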

Link identifiers are used by adjacent routers on a circuit to differentiate 
messages on different circuits. They are unique to the adjacent pair. Circuit 
identifiers are also included with each message and identify the circuit it is 
traveling on. They are unique among all circuits. Circuit identifiers are not used 
in the actual Tor protocol, and their only purpose here is to represent the ability 
of an adversary to insert and/or detect timing patterns in the traffic along a 
circuit. This reflects in our model the very real threat of timing attacks [11]. 
It has the added advantages of making it clear when this power is used and of 
being easy to remove in future model adjustments. 

The user automaton’s state consists of the sequence of routers in its circuit, 
a number that identifies its circuit, and a number that indicates the state of 
its circuit. We consider the final router in the circuit to be the destination of 
the user. The user automaton runs two threads, one to extend a circuit that 
is called upon receipt of a message and the other to start circuit creation that 
is called at the beginning of execution. We express these in pseudocode rather 
than IO automata, but note that the state changes in a particular branch occur
simultaneously in the automaton. $k(u, c, b)$ refers to the key used by user u with
router $c_b$ in the $b$th position in circuit c. The automaton for user u appears in
Automaton 1. 

The router automaton’s state is a set of keys and a table, T, with a row for 
each position the router holds in a circuit. Each row stores the previous and next 
hops in the circuit, identifying numbers for the incoming and outgoing links, and 
the associated key. There is only one thread and it is called upon receipt of a 
message. In the automaton for router r, we denote the smallest positive integer 
that is not being used on a link from r to q or from q to r as $\mathit{minid}(T, q)$. The
automaton for router r appears in Automaton 2. 
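
A minimal sketch of the link-identifier choice, assuming the routing table is a list of rows `(prev_hop, in_id, next_hop, out_id, key)` and that an initiated but unconfirmed outgoing link is stored with a negated identifier, as in the router automaton. The Python representation itself is an assumption made for illustration.

```python
def minid(table, q):
    """Smallest positive integer not in use as a link identifier
    between this router and agent q, in either direction."""
    used = set()
    for prev_hop, in_id, next_hop, out_id, _key in table:
        if prev_hop == q:
            used.add(in_id)
        if next_hop == q:
            used.add(abs(out_id))  # negative ids mark initiated links
    n = 1
    while n in used:
        n += 1
    return n


# Hypothetical table for some router r.
table = [
    ("u", 0, "r2", 1, "ka"),      # link 1 to r2 is established
    ("r2", 1, None, -1, "kb"),    # incoming link 1 from r2, no out link yet
    ("r3", 2, "r2", -3, "kc"),    # link 3 to r2 initiated, not yet confirmed
]
```

With this table, `minid(table, "r2")` is 2, and `minid(table, "r9")` is 1 because no link to `r9` exists.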
Automaton 1 User u

$c \in \{(r_1, \ldots, r_l) \in R^l \mid \forall i\; r_i \neq r_{i+1}\}$; init: arbitrary    ▷ User’s circuit
$i \in \mathbb{N}$; init: random    ▷ Circuit identifier
$b \in \mathbb{N}$; init: 0    ▷ Next hop to build

procedure Start
    Send($c_1$, $[i, 0, \{\mathrm{CREATE}\}_{k(u,c,1)}]$)
    $b = 1$
end procedure

procedure Message(msg, j)    ▷ $msg \in M$ received from $j \in N$
    if $j = c_1$ then
        if $b = 1$ then
            if $msg = [i, 0, \mathrm{CREATED}]$ then
                $b$++
                Send($c_1$, $[i, 0, \{[\mathrm{EXTEND}, c_b, \{\mathrm{CREATE}\}_{k(u,c,b)}]\}_{k(u,c,b-1),\ldots,k(u,c,1)}]$)
            end if
        else if $b < l$ then
            if $msg = [i, 0, \{\mathrm{EXTENDED}\}_{k(u,c,b-1),\ldots,k(u,c,1)}]$ then
                $b$++
                Send($c_1$, $[i, 0, \{[\mathrm{EXTEND}, c_b, \{\mathrm{CREATE}\}_{k(u,c,b)}]\}_{k(u,c,b-1),\ldots,k(u,c,1)}]$)
            end if
        else if $b = l$ then
            if $msg = [i, 0, \{\mathrm{EXTENDED}\}_{k(u,c,b-1),\ldots,k(u,c,1)}]$ then
                $b$++
            end if
        end if
    end if
end procedure
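
The user automaton can be exercised with a small executable sketch. It assumes symbolic encryption as nested tuples, a circuit of length at least 2, and an outgoing message log `sent` in place of real channels; the class and helper names are hypothetical.

```python
def enc(body, keys):
    """Symbolic encryption: keys[0] innermost, keys[-1] outermost."""
    for k in keys:
        body = ("enc", k, body)
    return body


class User:
    def __init__(self, circuit, circuit_id, keys):
        self.c = circuit      # routers c_1..c_l, stored 0-based
        self.i = circuit_id   # circuit identifier
        self.keys = keys      # keys[b-1] stands in for k(u, c, b)
        self.b = 0            # next hop to build
        self.sent = []        # (destination, message) pairs

    def _layers(self, upto):
        # k(u,c,upto), ..., k(u,c,1): innermost first
        return list(reversed(self.keys[:upto]))

    def start(self):
        self.sent.append((self.c[0], [self.i, 0, enc("CREATE", [self.keys[0]])]))
        self.b = 1

    def message(self, msg, sender):
        if sender != self.c[0] or self.b < 1 or self.b > len(self.c):
            return            # not from the first router, or nothing pending
        if self.b == 1:
            expected = [self.i, 0, "CREATED"]
        else:
            expected = [self.i, 0, enc("EXTENDED", self._layers(self.b - 1))]
        if msg != expected:
            return            # "junk" receive: ignored
        self.b += 1
        if self.b <= len(self.c):
            inner = ("EXTEND", self.c[self.b - 1],
                     enc("CREATE", [self.keys[self.b - 1]]))
            payload = enc(inner, self._layers(self.b - 1))
            self.sent.append((self.c[0], [self.i, 0, payload]))


u = User(("r1", "r2", "r3"), 7, ["k1", "k2", "k3"])
u.start()
u.message([7, 0, "CREATED"], "r9")                        # wrong sender, ignored
u.message([7, 0, "CREATED"], "r1")
u.message([7, 0, enc("EXTENDED", ["k1"])], "r1")
u.message([7, 0, enc("EXTENDED", ["k2", "k1"])], "r1")
```

After these calls the user has sent three messages, all to `r1`, and `u.b` is 4, which corresponds to a fully built circuit of length 3.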


3.3 System execution 

We use standard notions of execution and fairness. An execution is a possible 
run of the network given its initial state. Fairness for us means that any message 
an automaton wants to send will eventually be sent and every sent message is 
eventually received. 

We introduce the notion of a cryptographic execution. This is an execution 
in which no agent sends a control message encrypted with active keys it doesn’t 
possess before it receives that message. We restrict our attention to such exe- 
cutions, and must require our encryption operation to prevent an attacker from 
outputting a control message, with more than negligible probability, when it is 
encrypted with keys he doesn’t possess. This is reasonable because we can easily 
create a ciphertext space that is much larger than the rather limited control 
message space P. Note that this precludes the use of public key encryption to 
encrypt the packets because such messages can easily be constructed with the 
public keys of the routers. 

Automaton 2 Router r

$keys \subseteq K$, where $|keys| \ge |U| \cdot \lceil l/2 \rceil$; init: arbitrary    ▷ Private keys
$T \subset N \times \mathbb{N} \times R \times \mathbb{Z} \times keys$; init: $\emptyset$    ▷ Routing table

procedure Message($[i, n, p]$, q)    ▷ $[i, n, p] \in M$ received from $q \in N$
    if $[q, n, \emptyset, -1, k] \in T$ then    ▷ In link created, out link absent
        if $\exists\, s \in R - r,\ b \in \overline{P} : p = \{[\mathrm{EXTEND}, s, b]\}_k$ then
            Send(s, $[i, \mathit{minid}(T, s), b]$)
            $T = T - [q, n, \emptyset, -1, k] + [q, n, s, -\mathit{minid}(T, s), k]$
        end if
    else if $[s, m, q, -n, k] \in T$ then    ▷ In link created, out link initiated
        if $p = \mathrm{CREATED}$ then
            $T = T - [s, m, q, -n, k] + [s, m, q, n, k]$
            Send(s, $[i, m, \{\mathrm{EXTENDED}\}_k]$)
        end if
    else if $\exists\, m > 0 : [q, n, s, m, k] \in T$ then    ▷ In and out links created
        Send(s, $[i, m, \{p\}_{-k}]$)    ▷ Forward message down the circuit
    else if $[s, m, q, n, k] \in T$ then    ▷ In and out links created
        Send(s, $[i, m, \{p\}_k]$)    ▷ Forward message up the circuit
    else
        if $\exists\, k \in keys : p = \{\mathrm{CREATE}\}_k$ then    ▷ New link
            $T = T + [q, n, \emptyset, -1, k]$
            Send(q, $[i, n, \mathrm{CREATED}]$)
        end if
    end if
end procedure

Definition 1 An execution is a sequence of states of an IO automaton alternating
with actions of the automaton. It begins with an initial state, and two consecutive
states are related by the automaton transition function and the action
between them. Every action must be enabled, meaning that the acting automaton
must be in a state in which the action is possible at the point the action occurs.

Definition 2 A finite execution is fair if there are no actions enabled in the 
final state. Call an infinite execution fair if every output action that is enabled 
in infinitely many states occurs infinitely often. 

Definition 3 An execution is cryptographic if an agent sends a message containing
$\{p\}_{k_1,\ldots,k_i}$ only when it possesses all keys $k_1, \ldots, k_i$, or when, for the
largest $j$ s.t. the agent does not possess $k_j$, $1 \le j \le i$, the agent has already
received a message containing $\{p\}_{k_1,\ldots,k_j}$.

3.4 Distinguishability 

Definition 4 A configuration $C : U \to \{(r_1, \ldots, r_l, n) \in R^l \times \mathbb{N}^+ \mid \forall i\; r_i \neq r_{i+1}\}$
maps each user to the circuit and circuit identifier in his automaton state.
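
A configuration can be represented concretely and checked against these constraints. The sketch assumes a dictionary from user name to a pair `(path, circuit_id)` and also enforces the uniqueness of circuit identifiers stated in Section 3.2; the function name is hypothetical.

```python
def is_valid_configuration(config, routers, length):
    """Check that every user has a circuit of the given length over known
    routers, with no router immediately repeated, and that circuit
    identifiers are distinct positive integers."""
    seen_ids = set()
    for _user, (path, circuit_id) in config.items():
        if len(path) != length or any(r not in routers for r in path):
            return False
        if any(a == b for a, b in zip(path, path[1:])):
            return False   # violates r_i != r_{i+1}
        if not isinstance(circuit_id, int) or circuit_id < 1:
            return False
        if circuit_id in seen_ids:
            return False   # identifiers are unique among circuits
        seen_ids.add(circuit_id)
    return True


routers = {"r1", "r2", "r3"}
good = {"u": (("r1", "r2", "r1"), 1), "v": (("r3", "r1", "r2"), 2)}
repeated_hop = {"u": (("r1", "r1", "r2"), 1)}
duplicate_id = {"u": (("r1", "r2", "r3"), 1), "v": (("r3", "r1", "r2"), 1)}
```

Note that a router may appear more than once in a circuit, as in `good`, so long as it does not appear in two consecutive positions.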

The actions we want to be performed anonymously are closely related to the 
circuits the users try to construct during an execution. In our model, all messages 
are sent along links of a circuit; these messages are all circuit-creation messages 
and thus are entirely determined by the circuit, so the sender or receiver of a
given message corresponds directly to the path of the circuit. Therefore, in order
to prove that certain actions are performed anonymously in the network, we 
can just show that the adversary can never determine this circuit information. 
This is a possibilistic notion of anonymity. We do this by identifying classes
of adversary-indistinguishable configurations.

Because agent $i \in N$ only sees those messages sent to and from i, an execution of
a configuration C may appear the same to i as a similar execution of another 
configuration D that only differs from C in parts of the circuits that are not 
adjacent to i and in circuit identifiers that i never sees. To be assured that 
i will never notice a difference, we would like this to be true for all possible 
executions of C . These are the fair cryptographic executions of C, and likewise 
the executions of D should be fair and cryptographic. We will say that these 
configurations are indistinguishable if, for any fair cryptographic execution of C, 
there exists a fair cryptographic execution of D that appears identical to i, i.e. 
in which i sends and receives what appear to be the same messages in the same 
order. 

Agent i’s power to distinguish among executions is weakened by encryption 
in two ways. First, we allow a permutation on keys to be applied to the keys 
of encrypted or decrypted messages in an execution. This permutation can map 
a key from any router other than i to any other key of any other router other 
than i, because i can only tell that it doesn’t hold these keys. It can map any 
key of i to any other key of i, because i doesn’t know for which users and circuit 
positions its keys will be used. Second, i cannot distinguish among messages en- 
crypted with a key he does not possess, so we allow a permutation to be applied 
to control messages that are encrypted with a key that is not shared with i. This 
second requirement must be justified by the computational intractability of dis- 
tinguishing between encrypted messages with more than a negligible probability 
in our cryptosystem. 

Definition 5 Let $D_A$ be a relation over configurations indicating which configurations
are indistinguishable to $A \subseteq N$. For configurations C and C', $C \sim_{D_A} C'$
if for every fair cryptographic execution $\alpha$ of C, there exists some action sequence
$\beta$ s.t. the following conditions hold with C' as the initial state:

1. Every action of $\beta$ is enabled, except possibly actions done by members of A.

2. $\beta$ is fair for all agents, except possibly those in A.

3. $\beta$ is cryptographic for all agents.

4. Let $\Xi$ be the subset of permutations on the active keyspace $U \times R \times \lceil l/2 \rceil$ s.t.
each element restricted to keys involving $a \in A$ is a permutation on those
keys. We apply $\xi \in \Xi$ to the encryption of a message sequence by changing
every list component $\{p\}_{(u,r,i)}$ in the sequence to $\{p\}_{\xi(u,r,i)}$.
Let $\Pi$ be the subset of permutations on $\overline{P}$ s.t. for all $\pi \in \Pi$, $\pi$ is a permutation
on the set $\{\{p\}_{k_1,\ldots,k_i}\}_{p \in P}$ and $\pi(\{p\}_{k_1,\ldots,k_i,k_a}) = \{\pi(\{p\}_{k_1,\ldots,k_i})\}_{k_a}$ when
$k_a$ is shared by the adversary. We apply $\pi \in \Pi$ to a message sequence by
changing every message $\{p\}_{k_1,\ldots,k_i}$ in the message sequence to $\pi(\{p\}_{k_1,\ldots,k_i})$.
Then there must exist $\xi \in \Xi$ and $\pi \in \Pi$ s.t. applying $\xi$ and $\pi$ to the
subsequence of $\alpha$ corresponding to actions of A yields the subsequence of $\beta$
corresponding to actions of A.

If $C \sim_{D_A} C'$, we say that C is indistinguishable from C' to A. It is clear that
an indistinguishability relation is reflexive and transitive.
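
The key-renaming part of condition 4 can be illustrated on symbolic messages. The sketch assumes keys are triples `(user, router, index)`, ciphertexts are tuples `("enc", key, body)`, and that "keys involving a member of A" means the user or the router of the triple is in A. That reading, and the function names, are assumptions made for illustration.

```python
def apply_xi(msg, xi):
    """Rename every encryption key in a symbolic message via the mapping xi.
    Keys not mentioned in xi are left unchanged."""
    if isinstance(msg, tuple) and msg and msg[0] == "enc":
        _, key, body = msg
        return ("enc", xi.get(key, key), apply_xi(body, xi))
    if isinstance(msg, (tuple, list)):
        return type(msg)(apply_xi(part, xi) for part in msg)
    return msg


def involves(key, adversary):
    user, router, _index = key
    return user in adversary or router in adversary


def is_admissible(xi, adversary):
    """xi must permute its domain and must map adversary keys to adversary
    keys (and therefore other keys to other keys)."""
    is_permutation = set(xi.keys()) == set(xi.values())
    keeps_sides = all(
        involves(k, adversary) == involves(v, adversary) for k, v in xi.items()
    )
    return is_permutation and keeps_sides


xi = {("u", "r1", 1): ("v", "r1", 1), ("v", "r1", 1): ("u", "r1", 1)}
msg = [5, 0, ("enc", ("u", "r1", 1),
              ("EXTEND", "r2", ("enc", ("u", "r2", 1), "CREATE")))]
renamed = apply_xi(msg, xi)
```

In this example only the outer key is renamed, since the inner key `("u", "r2", 1)` is not in the domain of `xi`. The mapping is admissible when the adversary is `{"r2"}` but not when it is `{"u"}`.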

3.5 Anonymity and Unlinkability 

The sender in this model corresponds to the user of a circuit, the receiver to the 
last router of the circuit, and the messages we wish to communicate anonymously 
are just the circuit control messages. The circuit identifiers allow the adversary 
to link together all the messages initiated by a user and attribute them to a 
single source. (Recall that in our model, users open a single circuit to a unique 
destination at one time.) Therefore sender anonymity is provided to u if the 
adversary can’t determine which circuit identifier u is using. Similarly, receiver 
anonymity is provided to r for messages from u if the adversary can’t determine 
the destination of the circuit with u’s identifier. Also, unlinkability is provided 
to u and r if the adversary can’t determine u’s destination. 

Definition 6 User u has sender anonymity in configuration C with respect to 
adversary A if there exists some indistinguishable configuration C' in which u 
uses a different circuit identifier. 

Definition 7 Router r has receiver anonymity on user u’s circuit, in configuration
C, and with respect to adversary A, if there exists some indistinguishable
configuration C' in which a user with u’s circuit identifier, if one exists, has a
destination other than r.

Definition 8 User u and router r are unlinkable in configuration C if there is
some indistinguishable configuration C' in which the destination of u is not r.
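
Definitions 6 through 8 can be read as simple existential checks over a set of configurations that are already known to be indistinguishable from C. The sketch below assumes that set is given as a list, that a configuration is a dictionary from user to `(path, circuit_id)`, and that the "if one exists" clause of Definition 7 is satisfied vacuously when no user holds the identifier. The function names are hypothetical.

```python
def destination(config, user):
    return config[user][0][-1]


def has_sender_anonymity(user, config, indistinguishable):
    """Definition 6: some indistinguishable configuration gives the user
    a different circuit identifier."""
    return any(alt[user][1] != config[user][1] for alt in indistinguishable)


def has_receiver_anonymity(router, user, config, indistinguishable):
    """Definition 7: in some indistinguishable configuration, whoever holds
    the user's circuit identifier has a destination other than `router`."""
    circuit_id = config[user][1]
    return any(
        all(path[-1] != router for path, cid in alt.values() if cid == circuit_id)
        for alt in indistinguishable
    )


def is_unlinkable(user, router, indistinguishable):
    """Definition 8: some indistinguishable configuration gives the user
    a destination other than `router`."""
    return any(destination(alt, user) != router for alt in indistinguishable)


# Hypothetical example: C and the configuration with u's and v's circuits swapped.
C = {"u": (("r1", "r2", "r3"), 1), "v": (("r4", "r2", "r5"), 2)}
C_swapped = {"u": C["v"], "v": C["u"]}
candidates = [C, C_swapped]
```

In this example u has sender anonymity and is unlinkable to `r3`, but `r3` does not have receiver anonymity on u's circuit, because circuit 1 ends at `r3` in both configurations.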

4 Indistinguishable Configurations 

Now we will show that sometimes the adversary cannot determine the path or 
identifier of a circuit. More specifically, an adversary can only determine which 
user or router occupies a given position in a circuit when the adversary controls 
it or a router adjacent to it on that circuit. Also, when the adversary controls 
no part of a circuit it cannot determine its identifier. 

In order to do this, we must show that, given a pair of configurations (C, C')
that are indistinguishable by these criteria, for every execution of C there exists 
an execution of C' that appears identical to the adversary. To do this we will 
start with a fair cryptographic execution of C , describe how to transform it, and 
prove that this transformed sequence forms a fair, cryptographic, and indistin- 
guishable execution of C' . We will also show that a pair of configurations that 
are distinguishable by the described criteria allow no such transformation. 



4.1 Message Sequences 

To start, we observe that, in spite of the arbitrary actions of the adversary, 
the actions of the uncompromised users and routers in an execution are very 
structured. The protocol followed by the user and router automata defines a 
simple sequence of message sends and receives for every circuit. A user or router 
will only send a message as part of such a sequence. 

The user subsequence consists of messages between the user and the first 
router on its circuit. The user u in configuration C begins the sequence by 
sending a CREATE message to $C_1(u)$, the first router in its circuit in C; $C_1(u)$
responds with a CREATED message. Then for the remaining $l - 1$ routers on
the circuit, the user sends an EXTEND message and $C_1(u)$ responds with an
EXTENDED message. The user will only send a message as the next step in this 
sequence. Therefore we can take all actions in an execution by u and partition 
them into those that are part of this sequence and those that are not. Those 
that are not are “junk” receives that the adversary caused to be sent to u by 
not following the protocol. 

A router performs a similar sequence when it is added to a circuit. The 
sequence begins when router r receives a CREATE message from agent
n with the smallest unused link identifier at that point between r and n. r 
responds with a CREATED message. Then r receives an EXTEND message 
with the identity of the next router q and an enclosed CREATE message. It 
passes the enclosed message on to q , receives a CREATED message back, and 
sends an EXTENDED message to n. Then r forwards up or down the circuit any 
further messages received, r will only send messages as part of such a sequence. 
We can therefore partition all actions by r in an execution into sequences of this 
type, in addition to a sequence for “junk” receives that aren’t part of such a 
sequence and are a result of adversarial misbehavior. We will use the existence 
of such partitions of executions in our analysis. 

4.2 Indistinguishable Users 

Now we prove that an active adversary cannot determine which user creates a 
given circuit unless the first router on that circuit is controlled by the adversary 
or the owners of all the other circuits have been determined. That is, an adversary 
cannot distinguish between a configuration C and the configuration C' that is
identical to C except for two circuits with uncompromised first routers that 
are switched between the circuit owners. In order to do so, we must show that, 
for any fair cryptographic execution of C, there exists some action sequence of 
C' satisfying the indistinguishability requirements of Definition 5. To do so, we 
simply swap between the switched users the messages that pass between them 
and the first routers on their circuits and switch the encryption keys of these 
messages. 

Theorem 1 Let u, v be two distinct users s.t. neither they nor the first routers
in their circuits are compromised (i.e., are in A). Let C' be identical to C except
the circuits of users u and v are switched. C is indistinguishable from C' to A.



Proof Sketch: Let $\alpha$ be a fair cryptographic execution of C. To create a possible
execution of C', first construct $\alpha'$ by replacing any message sent or received
between u (v) and $C_1(u)$ ($C_1(v)$) in $\alpha$ with a message sent or received between
v (u) and $C_1(u)$ ($C_1(v)$). Then let $\xi$ be the permutation that sends u to v and
v to u and other users to themselves. Create $\beta$ by applying $\xi$ to the encryption
keys of $\alpha'$.

To show that the actions in this sequence are enabled for uncompromised
routers we observe that only message partitions for u, v, $C_1(u)$, and $C_1(v)$ have
been changed. These are modified so that they remain valid partitions. To show
that the execution is fair we observe that no “new” valid messages or sequences
can appear. The transformed sequence is cryptographic because the key
permutations and message changes are applied to the entire sequence and the original
sequence $\alpha$ was cryptographic. The permutation needed to make $\beta$ look like $\alpha$
to A is just the reverse of the key permutation used to create $\beta$. □
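
The configuration change and key permutation used in this proof can be sketched as follows. The code assumes a configuration is a dictionary from user to `(path, circuit_id)`, that switching circuits also switches circuit identifiers, and that keys are triples `(user, router, index)`; the function names are hypothetical.

```python
def swap_circuits(config, u, v, adversary):
    """Build C' from C by switching the circuits of u and v, after checking
    the hypotheses of Theorem 1."""
    for w in (u, v):
        first_hop = config[w][0][0]
        if w in adversary or first_hop in adversary:
            raise ValueError("Theorem 1 does not apply: %s or its first hop is compromised" % w)
    swapped = dict(config)
    swapped[u], swapped[v] = config[v], config[u]
    return swapped


def user_swap_permutation(keys, u, v):
    """The permutation on keys that sends u to v, v to u, and leaves
    other users alone."""
    rename = {u: v, v: u}
    return {(w, r, i): (rename.get(w, w), r, i) for (w, r, i) in keys}


C = {"u": (("r1", "r2", "r3"), 1), "v": (("r4", "r2", "r5"), 2)}
C_prime = swap_circuits(C, "u", "v", adversary={"r2"})
xi = user_swap_permutation({("u", "r1", 1), ("v", "r4", 1), ("w", "r2", 1)}, "u", "v")
```

With the adversary controlling only `r2`, both first hops are uncompromised and the swap succeeds. If the adversary instead controlled `r1`, `swap_circuits` would raise an error, reflecting that the theorem's hypothesis fails.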


4.3 Indistinguishable Routers 

Now we prove that an adversary cannot determine an uncompromised router 
on a given circuit unless it controls the previous or next router on that circuit. 
More formally, assume that the $(i-1)$st, $i$th, and $(i+1)$st routers of a user u’s
circuit in some configuration C are not compromised. We will show that C is
indistinguishable from configuration C', where C' is identical to C except the $i$th
router of u’s circuit has been arbitrarily changed. The proof is similar to that of
Theorem 1, although it is complicated by the fact that the identities of routers
in a circuit are included in multiple ways in the circuit creation protocol. Given
an execution of C, we identify those messages that are part of the circuit creation
sequence of the modified circuit and then change them to add a different router
in the $i$th position. Then we show that, in the sense of Definition 5, from the
adversary’s perspective this sequence is indistinguishable from the original and
could be an execution of C'.

Theorem 2 Say there is some user $u \notin A$ s.t. u’s circuit in C contains three
consecutive routers, $r_{i-1}, r_i, r_{i+1} \notin A$. Let C' be equal to C, except $r_i$ is replaced
with $r_i'$ in u’s circuit, where $r_i' \notin A \cup \{r_{i-1}, r_{i+1}\}$. C' is indistinguishable from
C to A. The same holds for uncompromised routers $(r_i, r_{i+1})$ if they begin u’s
circuit and are replaced with $(r_i', r_{i+1})$, or $(r_{i-1}, r_i)$ if they end u’s circuit and
are replaced with $(r_{i-1}, r_i')$.

Proof Sketch: Let a be some fair cryptographic execution of C. Let h(C(u),i) 
be the number of occurrences of the ith router in the circuit C(u) among the 
first i routers. We modify a in steps to create an indistinguishable sequence (3: 

1. Replace all messages of the form [EXTEND, n, {CREATE} u>r . h(C{u ) »)] with 
[EXTEND, r',{CREATE} U;r>(c , (tt)ii) ]. 

2. Consider the partitions of router $r_{i-1}$’s actions that each form a prefix of
the sequence adding $r_{i-1}$ to u’s circuit as the $(i-1)$st router. Replace all
messages in these partitions that are to and from $r_i$ with the same messages
to and from $r_i'$. Modify the link identifiers on these messages so that they
are the smallest identifiers in use between $r_{i-1}$ and $r_i'$ at that point in $\alpha$.
Increase link identifiers that are in use between $r_{i-1}$ and $r_i'$ to make room for
these new connections and decrease link identifiers that are in use between
$r_{i-1}$ and $r_i$ to fill in the holes created by the removed connections. Perform
similar modifications for routers $r_i$ and $r_{i+1}$.

3. Replace all keys of the form $(u, r_i, h(C(u),i))$ with $(u, r_i', h(C'(u),i))$. Increment
as necessary the third component of the encryption keys used between
u and $r_i'$ to take into account that $r_i'$ appears once more in $C'(u)$ than it
does in $C(u)$. Also decrement as necessary the third component of the keys
used between u and $r_i$ to take into account that $r_i$ appears once less in $C'(u)$
than it does in $C(u)$.
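
To make the key renumbering in step 3 concrete, the following Python sketch computes h and the key triples of a circuit before and after its ith router is replaced. It is only an illustration under assumptions that are not part of the model: circuits are represented as tuples of integer router names, positions are numbered from 1, and the helper names replace_router and circuit_keys are hypothetical.

```python
def h(circuit, i):
    """Number of occurrences of the i-th router among the first i routers.

    `circuit` is a tuple of router names; positions are 1-indexed.
    """
    return circuit[:i].count(circuit[i - 1])


def replace_router(circuit, i, new_router):
    """Return the circuit with its i-th router replaced (hypothetical helper)."""
    return circuit[:i - 1] + (new_router,) + circuit[i:]


def circuit_keys(user, circuit):
    """Key triples (u, r_j, h(C(u), j)) for every position j of the circuit."""
    return [(user, r, h(circuit, j)) for j, r in enumerate(circuit, start=1)]


# Example: replace the 3rd router (5) with router 8.
old = (4, 7, 5, 9, 8, 5)
new = replace_router(old, 3, 8)

old_keys = circuit_keys("u", old)
new_keys = circuit_keys("u", new)

# Position 5 is a later occurrence of router 8: its key index is incremented.
# Position 6 is a later occurrence of router 5: its key index is decremented.
for j, (before, after) in enumerate(zip(old_keys, new_keys), start=1):
    print(j, before, "->", after)
```

In this example the key at position 5 changes from (u, 8, 1) to (u, 8, 2) and the key at position 6 changes from (u, 5, 2) to (u, 5, 1), which is the increment and decrement described in step 3.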

The actions in the transformed sequence are enabled because we convert the
partitions involving $r_i$ to involve $r_i'$ instead, adjusting link and key numbering as
needed to maintain global consistency. $\beta$ is cryptographic because the key and
message permutations used to create it are applied uniformly. The transformed
execution is fair for two reasons. First, we modify partitions as a whole, so there
are no partial unfinished sequences. Second, $\beta$ is cryptographic, so no “junk” receives
from the adversary could be valid messages in a transformed partition. One attack
that this reasoning rules out is a compromised router a sending
a valid EXTEND message that directs the router $r_i'$ at the end of u’s partially
constructed circuit to create a link to a. Such a message, if a weren’t prohibited
from sending it, only enables router action when $r_i'$ is the router at the end of
the circuit, since another router would be using a different key. This would leave
$r_i'$ with an enabled action in $\beta$. Finally, the required permutations to make $\beta$
appear like $\alpha$ to A are simply the reverse of those used to create $\beta$ in the first
place. □

4.4 Indistinguishable Identifiers 

Theorem 3 Say there is some uncompromised user u s.t. all routers in $C(u)$
are uncompromised. Let C' be a configuration that is identical to C, except that
u uses a different circuit identifier. C' is indistinguishable from C to A.

Proof. Let $\alpha$ be a fair cryptographic execution of C. To create $\beta$, simply change
every occurrence of u’s circuit identifier in C ($C_{l+1}(u)$) to its identifier in C'. $\beta$
is enabled, fair, and cryptographic for C' because no message containing $C_{l+1}(u)$
gets sent to the adversary in $\alpha$ and the protocol itself ignores circuit identifiers
except to forward them on. It appears the same to A for the same reason.

4.5 Distinguishable Configurations 

The relation $D_A$, when restricted to the transitive closure of pairs that are
indistinguishable by Theorems 1-3, is symmetric, and therefore forms an equivalence
relation. Let $\sim_{D_A}$ denote this relation.



We can easily tell which configurations are in the same equivalence class using
a function $\rho$ that reduces a circuit to an identifier, the compromised positions,
and the positions adjacent to compromised positions. For convenience, in the
following we take $c_0$ to refer to u.

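Definition 9 Let $\rho : U \times \mathbb{N}^l \times \mathbb{N}^+ \times \mathcal{P}(\mathbb{N}) \to \mathbb{N} \times \mathcal{P}(\mathbb{N} \times \mathbb{N}^+)$ be:

$$\rho(u, c, A) = \begin{cases} \left(c_{l+1}, \{(r, i) \in \mathbb{N} \times \mathbb{N}^+ \mid c_i = r \wedge (c_{i-1} \in A \vee c_i \in A \vee c_{i+1} \in A)\}\right) & \text{if } c_i \in A \text{ for some } i \\ (0, \emptyset) & \text{otherwise} \end{cases}$$

We overload this notation: $\rho(C)$ refers to the multiset formed from the
circuits of configuration C adjoined with their user and reduced by $\rho$, i.e.,
$\rho(C) = \{\rho(u, C(u), A) \mid u \in U\}$. Thus $\rho$ captures the indistinguishable features of
a configuration according to Theorems 1, 2, and 3.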
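
The reduction can be sketched in a few lines of Python. This is an illustration rather than part of the model, and it makes several assumptions: agents are named by strings instead of natural numbers, a configuration is a dictionary from each user to a pair (routers, circuit identifier), the position after the last router is treated as having no agent, and the names rho and rho_config are chosen here for convenience.

```python
from collections import Counter


def rho(user, routers, circuit_id, adversary):
    """Reduce one circuit to the features visible to the adversary."""
    # c_0 is the user and c_1..c_l are the routers.
    c = (user,) + tuple(routers)
    if not any(agent in adversary for agent in c):
        return (0, frozenset())
    visible = set()
    for i in range(1, len(c)):
        # Positions i-1, i, i+1 (the slice is shorter at the last router).
        window = c[i - 1:i + 2]
        if any(agent in adversary for agent in window):
            visible.add((c[i], i))
    return (circuit_id, frozenset(visible))


def rho_config(config, adversary):
    """Multiset of reduced circuits for a whole configuration."""
    return Counter(
        rho(u, routers, cid, adversary) for u, (routers, cid) in config.items()
    )


adversary = {"r1"}
C = {"u": (("r1", "r2", "r3", "r4"), 11), "v": (("r5", "r6", "r7", "r8"), 12)}
# Change u's third router (as in Theorem 2) and v's identifier (as in Theorem 3).
C2 = {"u": (("r1", "r2", "r9", "r4"), 11), "v": (("r5", "r6", "r7", "r8"), 13)}
# Change u's second router, which is adjacent to the compromised r1.
C3 = {"u": (("r1", "r6", "r3", "r4"), 11), "v": (("r5", "r6", "r7", "r8"), 12)}

print(rho_config(C, adversary) == rho_config(C2, adversary))  # same reduction
print(rho_config(C, adversary) == rho_config(C3, adversary))  # different reduction
```

Here C and C2 reduce to the same multiset, while C3 does not, because the changed router sits next to a compromised position.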

Now we show that the equivalence relation is in fact the entire indistinguisha- 
bility relation and that Theorems 1, 2, and 3 characterize which configurations 
are indistinguishable. 

Theorem 4 If $C \not\sim_{D_A} D$, then C and D are distinguishable to A.

Proof. We show the contrapositive. Suppose C and D are in different equivalence 
classes. Let the adversary run the automata prescribed by the protocol on the 
agents it controls. Let $\alpha$ and $\beta$ be fair cryptographic executions of C and D
respectively. 

Partition the adversary actions of $\alpha$ into subsequences that share the same
circuit identifier. There is at most one such partition for each circuit. Circuit 
positions that are created in the same partition belong to the same circuit. In 
each partition the adversary can determine the absolute location of a circuit 
position filled by a given compromised agent a by counting the total number 
of messages it sees after the initial CREATE. Clearly A can also determine the 
agents that precede and succeed a on the circuit and the circuit identifier itself. 
Therefore A can determine the reduced circuit structure $\rho(C)$ from $\alpha$.

The adversary can use $\beta$ in the same way to determine $\rho(D)$. It is easy to
see that $C \sim_{D_A} D$ if and only if $\rho(C) = \rho(D)$, so $\rho(C) \neq \rho(D)$. Therefore A can
always distinguish between C and D.

4.6 Anonymity 

The configurations that provide sender anonymity, receiver anonymity, and un- 
linkability follow easily from Theorems 1, 2, 3, and 4. In the following, let u be 
a user, C be a configuration, r be a router, and A be the adversary. 

Corollary 1 u has sender anonymity in C with respect to A if and only if at
least one of two cases holds. The first is that u and $C_1(u)$ are uncompromised
and there exists another user $v \neq u$ s.t. v and $C_1(v)$ are uncompromised. The
second is that u and all $C_i(u)$ are uncompromised. □
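
The condition of Corollary 1 can be written as a simple predicate. The sketch below is illustrative only and reads the first case as referring to the first router on each circuit. It assumes a configuration is a dictionary from users to tuples of routers (circuit identifiers are omitted because the corollary does not mention them), and the function name has_sender_anonymity is hypothetical.

```python
def has_sender_anonymity(u, config, adversary):
    """Check the two cases of Corollary 1 for user u."""
    if u in adversary:
        return False
    circuit = config[u]
    # Second case: u and every router on u's circuit are uncompromised.
    if all(r not in adversary for r in circuit):
        return True
    # First case: u's first router is uncompromised, and some other
    # uncompromised user v also has an uncompromised first router.
    if circuit[0] in adversary:
        return False
    return any(
        v != u and v not in adversary and config[v][0] not in adversary
        for v in config
    )


adversary = {"r1"}
config = {"u": ("r2", "r1"), "v": ("r3", "r1"), "w": ("r1", "r4")}
for user in config:
    print(user, has_sender_anonymity(user, config, adversary))
```

In this example u and v each hide behind the other, while w enters the network through a compromised first router and has no sender anonymity.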



Corollary 2 r has receiver anonymity on u’s circuit in C with respect to A if
and only if at least one of two cases holds. The first is that u, r, and $C_{l-1}(u)$ are
uncompromised and there exists another router $q \neq r$ s.t. q is uncompromised.
The second is that u and all $C_i(u)$ are uncompromised. □

Corollary 3 u and r are unlinkable in C with respect to A if and only if at least
one of two cases holds. The first is that u, r, and $C_{l-1}(u)$ are uncompromised,
and there exists another router $q \neq r$ s.t. q is uncompromised. For the second
case it must be that u and $C_1(u)$ are uncompromised and there exists another
user $v \neq u$ s.t. v and $C_1(v)$ are uncompromised. Also, it must be that $C_l(v) \neq r$,
or $C_{l-1}(v)$ and r are uncompromised and there exists another router $q \neq r$ s.t.
q is uncompromised. □


4.7 Model Changes 

We chose the described protocol to balance two goals. The first was to accurately
model Tor. The second was to keep it simple enough to analyze, so that
the main ideas of the analysis weren’t unnecessarily complicated. Our results are
robust to changes of the protocol, however. We can make the protocol simpler by 
removing multiple encryption and the circuit identifiers without weakening the 
indistinguishability results. Single encryption does allow the adversary to easily 
link entries in his routers by sending messages along the circuit. This power 
is already available in our model from circuit identifiers, though. In the other 
direction, we can make it more complicated with a stream cipher and multiple 
circuits per user without weakening the distinguishability results. 

Stream ciphers are used in the Tor protocol and prevent signaling along a
circuit using dummy messages. Sending such messages will throw off the counters
kept by some routers on the circuit, and the circuit will stop working. We can model
a stream cipher by expressing the encryption of the $i$th message p with key k as
$\{p\}_{(k,i)}$, and allowing a different permutation to be applied for every pair $(k,i)$.
This can only increase the size of the configuration indistinguishability relation. 
However, the proof for the distinguishability of configurations only relies on the 
ability of the adversary to decrypt using his keys, count messages, and recognize 
the circuit identifier. Therefore it still holds when the model uses a stream cipher. 
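
The effect of an injected dummy message on a counter-indexed cipher can be illustrated with a toy example. The sketch below is not Tor's cipher and is not meant to be secure. It assumes a keystream derived from SHA-256 over the key and the message counter i, so that each pair (k, i) induces a different transformation. The class name CounterCipher and its methods are hypothetical.

```python
import hashlib


def keystream(key: bytes, i: int, n: int) -> bytes:
    """Derive n pseudo-random bytes from the pair (key, i). Toy construction."""
    out = b""
    block = 0
    while len(out) < n:
        data = key + i.to_bytes(8, "big") + block.to_bytes(4, "big")
        out += hashlib.sha256(data).digest()
        block += 1
    return out[:n]


class CounterCipher:
    """One end of a link; the counter i advances with every message processed."""

    def __init__(self, key: bytes):
        self.key = key
        self.counter = 0

    def process(self, data: bytes) -> bytes:
        # XOR with the keystream for (key, counter); the same call
        # encrypts at the sender and decrypts at the receiver.
        self.counter += 1
        ks = keystream(self.key, self.counter, len(data))
        return bytes(a ^ b for a, b in zip(data, ks))


sender = CounterCipher(b"shared key")
router = CounterCipher(b"shared key")

# Counters agree, so the first message decrypts correctly.
first = router.process(sender.process(b"hello"))

# An injected dummy message advances only the router's counter.
router.process(b"dummy")

# The next genuine message is now decrypted under the wrong pair (k, i).
second = router.process(sender.process(b"world"))
print(first == b"hello", second == b"world")
```

Once the router's counter runs ahead of the sender's, later messages on the circuit no longer decrypt to their plaintexts, which is the sense in which the circuit stops working.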

Allowing users to create multiple circuits doesn’t weaken the adversary’s 
power to link together its circuit positions and determine their position, but the 
number of configurations that are consistent with this view does in some cases 
increase. Let users create an arbitrary number of circuits. The adversary can 
still link positions and count messages as before, so the adversary can distinguish
configurations C and D if $\rho(C) \neq \rho(D)$. However, Theorems 1-3 no longer
identify all indistinguishable configurations, because a circuit with an unknown 
user can belong to any user, not just a user with an uncompromised first router. 
It can, however, be shown that the converse of Theorem 4 continues to hold if
we replace “$C \sim_{D_A} D$” with “$\rho(C) = \rho(D)$.”



5 Conclusions 


We have presented a model of onion routing and characterized when anonymity 
and unlinkability are provided. It is an asynchronous model using IO automata,
and the protocol is based on Tor. The adversary we analyze is local and active in 
the sense that he is allowed to run arbitrary automata but is limited to the view 
of a subset of users and routers that he controls. We show that the adversary 
can determine when his routers hold positions in the same circuit and where in 
the circuit they are located, and only this. Thus anonymity is generally provided 
when the first or last circuit router is uncompromised. 

Two directions for future work on modeling onion routing are improving the 
model and improving the analysis. A big missing piece in the current model is the 
lack of time. Timing attacks are successful in practice, and we have only approx- 
imated them in our model with circuit identifiers. Also, we have simplified the 
Tor protocol by omitting key exchange, circuit teardowns, the final unencrypted 
message forward, and stream management and congestion control. Adding some 
or all of these would bring the model closer to reality. Towards improving the 
analysis, we have made several assumptions about the cryptosystem but have 
not exhibited an encryption scheme for which they hold. Also, probabilities in 
both the behavior of the users and the operation of the system should be added to
the model and analyzed according to probabilistic definitions of anonymity. 